PlantTFDB
Plant Transcription Factor Database
v4.0
Previous version: v3.0
Transcription Factor Information
Basic Information
TF ID Thecc1EG015114t2
Common Name TCM_015114
Organism Theobroma cacao
Taxonomic ID
Taxonomic Lineage
cellular organisms; Eukaryota; Viridiplantae; Streptophyta; Streptophytina; Embryophyta; Tracheophyta; Euphyllophyta; Spermatophyta; Magnoliophyta; Mesangiospermae; eudicotyledons; Gunneridae; Pentapetalae; rosids; malvids; Malvales; Malvaceae; Byttnerioideae; Theobroma
Family HD-ZIP
Protein Properties Length: 748 aa    MW: 82174.7 Da    pI: 6.1082
Description HD-ZIP family protein
Gene Model
Gene Model ID     Type    Source  Coding Sequence
Thecc1EG015114t2  genome  CGD     View CDS
Signature Domain
No.  Domain    Score  E-value  Start  End  HMM Start  HMM End
1    Homeobox  64.1   2e-20    54     109  1          56
                       TT--SS--HHHHHHHHHHHHHSSS--HHHHHHHHHHCTS-HHHHHHHHHHHHHHHH CS
          Homeobox   1 rrkRttftkeqleeLeelFeknrypsaeereeLAkklgLterqVkvWFqNrRakek 56 
                       ++k +++t++q++eLe++F+++++p++++r eL+++l L+ +q+k+WFqNrR+++k
  Thecc1EG015114t2  54 KKKYHRHTPHQIQELESFFKECPHPDEKQRLELSRRLALESKQIKFWFQNRRTQMK 109
                       79999************************************************999 PP

2    START     173.5  1.3e-54  255    477  2          205
                       HHHHHHHHHHHHHHHC-TT-EEEE....EXCCTTEEEEEEESSS......SCEEEEEEEECCSCHHHHHHHHHCCCGGCT-TT-S....EEEE CS
             START   2 laeeaaqelvkkalaeepgWvkss....esengdevlqkfeeskv.....dsgealrasgvvdmvlallveellddkeqWdetla....kaet 81 
                       +a++a++el+k+++ ++p+W k      e +n +e++++f++  +     + +ea r++g+v+     lve+l+d + +W e+++    +++t
  Thecc1EG015114t2 255 IALAAMDELIKMVQMDSPLWIKGLdggmETLNHEEYRRTFSSCIGmkpsgYATEATRETGLVFLRGLALVETLMDAN-RWAEMFPcmisRVAT 346
                       6899************************************98888********************************.*************** PP

                       EEEECTT......EEEEEEEEXXTTXX-SSX.EEEEEEEEEEE.TTS-EEEEEEEEE-TTS--.-TTSEE-EESSEEEEEEEECTCEEEEEEE CS
             START  82 levissg......galqlmvaelqalsplvp.RdfvfvRyirqlgagdwvivdvSvdseqkppesssvvRaellpSgiliepksnghskvtwv 167
                       ++v+ss+      ++lq+m ae+q+lsplvp R + f+R+++q+++ +w++vdvS+d  q+  + + +  +++lpSg++i++++n +skvtwv
  Thecc1EG015114t2 347 IDVLSSAtgvtrdNTLQVMDAEFQVLSPLVPvRQVRFLRFCKQHTERVWAVVDVSIDASQDAASAQMFPNCRRLPSGCVIQDMDNKYSKVTWV 439
                       **************************************************************9888899************************ PP

                       E-EE--SSXXHHHHHHHHHHHHHHHHHHHHHHTXXXXX CS
             START 168 ehvdlkgrlphwllrslvksglaegaktwvatlqrqce 205
                       eh +++++ +h llr+l+++g  +ga +w+atlqrqc 
  Thecc1EG015114t2 440 EHSEYDDSAVHHLLRPLLSYGFGFGAHRWLATLQRQCD 477
                       ************************************96 PP

Protein Features
3D Structure
Database         Entry ID          E-value    Start  End  InterPro ID  Description
Gene3D           G3DSA:1.10.10.60  2.0E-21    40     111  IPR009057    Homeodomain-like
SuperFamily      SSF46689          5.85E-20   40     111  IPR009057    Homeodomain-like
PROSITE profile  PS50071           17.508     51     111  IPR001356    Homeobox domain
SMART            SM00389           6.2E-17    53     115  IPR001356    Homeobox domain
Pfam             PF00046           4.7E-18    54     109  IPR001356    Homeobox domain
CDD              cd00086           2.16E-18   54     111  No hit       No description
PROSITE pattern  PS00027           0          86     109  IPR017970    Homeobox, conserved site
PROSITE profile  PS50848           37.517     245    481  IPR002913    START domain
SuperFamily      SSF55961          8.65E-30   246    478  No hit       No description
CDD              cd08875           1.76E-110  249    476  No hit       No description
SMART            SM00234           1.2E-34    254    478  IPR002913    START domain
Pfam             PF01852           2.1E-46    255    477  IPR002913    START domain
SuperFamily      SSF55961          3.3E-18    506    741  No hit       No description
Gene Ontology
GO Term     GO Category         GO Description
GO:0006355  Biological Process  regulation of transcription, DNA-templated
GO:0005634  Cellular Component  nucleus
GO:0008289  Molecular Function  lipid binding
GO:0043565  Molecular Function  sequence-specific DNA binding
Sequence
Protein Sequence    Length: 748 aa
MDAHGEMGLI GENFDPGLVG RMKEDGYESR SGSDNFEGAS GDDQDAADDG RPKKKKYHRH  60
TPHQIQELES FFKECPHPDE KQRLELSRRL ALESKQIKFW FQNRRTQMKT QLERHENVIL  120
RQENDKLRAE NDLLKQAMSS PTCNSCGGPA VPGEISYEQH QLRIENARLK DELNRICALT  180
NKFLGRPLSS SASPIPSQGL NSNLELAVGR NDFGGLNNAG TTLPMGFDFV DGAMMPLMKT  240
MANEMPYDRS ALVDIALAAM DELIKMVQMD SPLWIKGLDG GMETLNHEEY RRTFSSCIGM  300
KPSGYATEAT RETGLVFLRG LALVETLMDA NRWAEMFPCM ISRVATIDVL SSATGVTRDN  360
TLQVMDAEFQ VLSPLVPVRQ VRFLRFCKQH TERVWAVVDV SIDASQDAAS AQMFPNCRRL  420
PSGCVIQDMD NKYSKVTWVE HSEYDDSAVH HLLRPLLSYG FGFGAHRWLA TLQRQCDCLA  480
VLMSPNIPGE ENTGITPAGR KNMLKLAQRM TYNFCAGVCA SSVHKWDKLS VGNVGEDVRV  540
MTRKNIDDPG EPAGVVLSAA TSVWMPITQQ RLFDFLRDER MRSQWDILSN GGPMQGMVKI  600
AKGPGHGNCV SLLRGSAINA NENNMLILQE TWSDASGALV VYAPVDISSI GVVMNGGDSA  660
YVALLPSGFA ILPGISPSYH GGQSNSNGPM VKPDIDGSIS GGCLLTVGFQ ILVNSLPTAK  720
LTVESVETVN NLISCTIQKI KAALTVT*
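
The domain coordinates in the Signature Domain table are 1-based and inclusive, while the sequence block above carries spaces and running position counters. The following is a minimal Python sketch, not part of the PlantTFDB record, showing one way to clean the block and cut out the Homeobox region (residues 54 to 109). The helper names `clean_sequence` and `domain_slice` are hypothetical, and only the first two lines of the sequence (residues 1 to 120) are embedded to keep the example short.

```python
import re

# First two lines of the formatted sequence block (residues 1-120),
# copied with their spacing and trailing position counters.
RAW_BLOCK = """
MDAHGEMGLI GENFDPGLVG RMKEDGYESR SGSDNFEGAS GDDQDAADDG RPKKKKYHRH  60
TPHQIQELES FFKECPHPDE KQRLELSRRL ALESKQIKFW FQNRRTQMKT QLERHENVIL  120
"""


def clean_sequence(raw: str) -> str:
    """Drop whitespace, position counters, and a trailing '*' stop marker."""
    return re.sub(r"[\s\d]", "", raw).rstrip("*")


def domain_slice(seq: str, start: int, end: int) -> str:
    """Return residues start..end, given 1-based inclusive coordinates."""
    return seq[start - 1:end]


sequence = clean_sequence(RAW_BLOCK)

# Homeobox coordinates (54-109) as listed in the Signature Domain table.
homeobox = domain_slice(sequence, 54, 109)

print(len(sequence), len(homeobox))
print(homeobox)
```

The extracted slice can be compared against the query line of the Homeobox alignment as a sanity check on the coordinates. Applying the same helper to the START domain (255 to 477) would require embedding the full sequence block.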
Regulation -- PlantRegMap
Source       Upstream Regulator  Target Gene
PlantRegMap  Retrieve            -
Annotation -- Protein
Source     Hit ID           E-value  Description
Refseq     XP_007038621.1   0.0      Homeobox-leucine zipper family protein / lipid-binding START domain-containing protein isoform 1
Swissprot  Q0WV12           0.0      ANL2_ARATH; Homeobox-leucine zipper protein ANTHOCYANINLESS 2
TrEMBL     A0A061G1P9       0.0      A0A061G1P9_THECC; Homeobox-leucine zipper family protein / lipid-binding START domain-containing protein isoform 1
STRING     GLYMA20G29580.2  0.0      (Glycine max)
Best hit in Arabidopsis thaliana
Hit ID       E-value  Description
AT4G00730.1  0.0      HD-ZIP family protein
Publications
  1. Motamayor JC, et al.
    The genome sequence of the most widely cultivated cacao type and its use to identify candidate genes regulating pod color.
    Genome Biol., 2013. 14(6): p. r53
    [PMID:23731509]